Data Placement in Parallel Database Systems
Abstract
The way in which data is distributed across the processing elements of a parallel shared-nothing architecture can have a significant effect on the performance of a parallel DBMS. Data placement strategies provide a mechanical approach to determining a data distribution which will provide good performance. However, there is considerable variation in the results produced by different strategies and no simple way of determining which strategy will provide the best results for any particular database application. This paper considers five different data placement strategies and illustrates some of the problems associated with the placement of data by studying the sensitivity of the results produced by these strategies to changes in a number of environmental factors, such as the number of processing elements participating in database activities and the size of the database. The study was conducted using an analytical performance estimator for parallel database systems, in the context of a shared-nothing parallel database system running the transaction processing benchmark TPC-C. It provides some insight into the effect which different types of strategies can have on the overall performance of the system.
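As a rough illustration of why the choice of placement matters, the sketch below compares three textbook declustering schemes (round-robin, hash, and range) on a deliberately skewed key set. These are assumptions chosen for illustration only: the abstract does not name the five strategies the paper evaluates, and the function names, key set, and imbalance metric here are hypothetical rather than taken from the paper.

```python
from collections import Counter


def round_robin(keys, n_pe):
    """Assign the i-th tuple to processing element i mod n_pe."""
    return [i % n_pe for i, _ in enumerate(keys)]


def hash_placement(keys, n_pe):
    """Assign each tuple by key modulo n_pe (a simple stand-in for a hash)."""
    return [k % n_pe for k in keys]


def range_placement(keys, n_pe, lo, hi):
    """Split the key domain [lo, hi) into n_pe equal-width ranges."""
    width = (hi - lo) / n_pe
    return [min(int((k - lo) / width), n_pe - 1) for k in keys]


def imbalance(assignment, n_pe):
    """Tuple count on the most loaded element divided by the ideal even share."""
    counts = Counter(assignment)
    ideal = len(assignment) / n_pe
    return max(counts.get(pe, 0) for pe in range(n_pe)) / ideal


# Hypothetical skewed data: 75 tuples with keys in [0, 25), 75 spread over [25, 100).
keys = [i % 25 for i in range(75)] + list(range(25, 100))
n_pe = 4  # number of processing elements (assumed)

results = {
    "round_robin": imbalance(round_robin(keys, n_pe), n_pe),
    "hash": imbalance(hash_placement(keys, n_pe), n_pe),
    "range": imbalance(range_placement(keys, n_pe, 0, 100), n_pe),
}

for name, value in results.items():
    print(f"{name}: imbalance {value:.3f}")
```

On this key set the equal-width range scheme concentrates the skewed keys on one element, while round-robin and hash spread them almost evenly. Balance is only one factor, though: a scheme that balances storage well may still force more elements to take part in each query, which is the kind of trade-off a sensitivity study of this sort examines.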
Similar resources
Data Placement in a Shared-Nothing Parallel Deductive Database
Until recently most research into parallel databases has focussed on relational database systems. Nevertheless, there is growing interest in more powerful alternative systems such as deductive databases. Several rule handling strategies have been developed to incorporate deductive capabilities into parallel database systems. However, in a shared-nothing environment, the performance of a rule ha...
The Impact of Data Placement Strategies on Reorganization Costs in Parallel Databases
In this paper, we study the data placement problem from a reorganization point of view. Effective placement of the declustered fragments of a relation is crucial to the performance of parallel database systems having multiple disks. Given the dynamic nature of database systems, the optimal placement of fragments will change over time and this will necessitate a reorganization in order to maintai...
Data Placement and Query Processing Based on RPE Parallelisms
The basic idea behind parallel database systems is to perform operations in parallel to reduce the response time and improve the system throughput. Data placement is a key factor in the performance of parallel database systems. This paper proposes two data partition strategies to decluster very large XML documents, Path Schema based Path Instance Balancing (PSPIB) strategy, in which a...
Database Placement on Large-Scale Systems
Large-scale systems such as Grids offer infrastructures for both data distribution and parallel processing. The use of Grid infrastructures is a more recent issue that is already impacting the Distributed Database Management System industry. In DBMS, distributed query processing has emerged as a fundamental technique for ensuring high performance in distributed databases. Database placement is ...
On Flexible Allocation of Index and Temporary Data in Parallel Database Systems
Data placement is a key factor for high performance database systems. This is particularly true for parallel database systems where data allocation must support both I/O parallelism and processing parallelism within complex queries and between independent queries and transactions. Determining an effective data placement is a complex administration problem depending on many parameters including ...
Publication date: 1996